Classifying and Clustering Dialects of North American English

Author

  • Keelan Evanini
Abstract

This paper presents the results of experiments in which machine learning techniques were applied to the problem of determining regional dialect boundaries. Specifically, decision tree classification and k-means clustering were applied to a corpus of phonetic measurements taken from a large survey of North American English vowels. Pairwise classification and clustering experiments were run for all combinations of ten dialect regions identified by dialectologists. The results show which of these dialect regions are most distinct and which are most similar, suggesting which of the distinctions commonly drawn by linguists are the most meaningful. Furthermore, the classification trees are analyzed to show which vowel formants are most informative for each dialect region.
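As a rough illustration of the pairwise clustering setup the abstract describes, the sketch below pools the speakers of two regions, runs k-means with k = 2 on their formant vectors, and checks how closely the two clusters match the region labels. Everything specific here is an assumption made for illustration: the region names "A", "B", "C", the (F1, F2) means in Hz, the synthetic Gaussian data, and the purity-style agreement score are all hypothetical and are not taken from the paper or its corpus.

```python
import itertools
import random


def dist2(p, q):
    """Squared Euclidean distance between two formant vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans2(points, iters=20):
    """Plain k-means with k=2; returns a cluster index (0 or 1) per point."""
    # Deterministic start: the first point and the point farthest from it.
    centroids = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [
            0 if dist2(p, centroids[0]) <= dist2(p, centroids[1]) else 1
            for p in points
        ]
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign


def pairwise_purity(data):
    """Cluster each pair of regions into two groups and score the agreement
    between clusters and region labels (0.5 = chance, 1.0 = perfect)."""
    scores = {}
    for r1, r2 in itertools.combinations(sorted(data), 2):
        points = data[r1] + data[r2]
        labels = [0] * len(data[r1]) + [1] * len(data[r2])
        assign = kmeans2(points)
        agree = sum(a == l for a, l in zip(assign, labels))
        # Cluster indices are arbitrary, so take the better of the two matchings.
        scores[(r1, r2)] = max(agree, len(points) - agree) / len(points)
    return scores


def make_synthetic(seed=0, n=30):
    """Hypothetical (F1, F2) values in Hz for one vowel, n speakers per region."""
    means = {"A": (700, 1200), "B": (710, 1220), "C": (600, 1700)}  # invented
    rng = random.Random(seed)
    return {
        region: [(rng.gauss(f1, 40), rng.gauss(f2, 40)) for _ in range(n)]
        for region, (f1, f2) in means.items()
    }


scores = pairwise_purity(make_synthetic())
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

In this toy setting, a pair of regions whose vowels overlap heavily scores near chance, while a well-separated pair scores near 1.0, which is the kind of contrast a pairwise comparison of dialect regions is meant to surface.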

Similar resources

Intrinsic vowel duration and the post-vocalic voicing effect: some evidence from dialects of North American English

We report the results of a comprehensive dialectal survey of three vowel duration phenomena in North American English: gross duration differences between dialects, the effect of postvocalic consonant voicing, and intrinsic vowel duration. Duration data, from HMM-based forced alignment of phones in the Atlas of North American English corpus [1], showed that 1) the post-vocalic voicing effect app...

A comparison of acoustic and articulatory methods for analyzing vowel differences across dialects: Data from American and Australian English.

In studies of dialect variation, the articulatory nature of vowels is sometimes inferred from formant values using the following heuristic: F1 is inversely correlated with tongue height and F2 is inversely correlated with tongue backness. This study compared vowel formants and corresponding lingual articulation in two dialects of English, standard North American English, and Australian English....

Vowel perception by listeners from different English dialects

Native English listeners from North America rely primarily on changes in formants, not vowel duration, when perceiving the vowel contrast in the minimal pair bit and beat manipulated from a Canadian English sample [5]. In this paper, we evaluated which cue native English listeners from other regions use when perceiving the same North American vowel contrast. For this purpose, we used the sam...

Patterns of Assimilation Nasality in English as a Function of Vowel Height

Assimilation nasality patterns for high, mid and low vowels were studied in two dialects of North American English (Canadian & southeastern American). Native speakers (n=24) produced CVC, NVC, CVN and NVN tokens. The vowel portion of each oral and nasal acoustical signal was transduced by a Nasometer, digitized, and the degree of nasalance established as: % nasalance = nasal rms/(nasal + oral r...

Classifying English Documents by National Dialect

We investigate national dialect identification, the task of classifying English documents according to their country of origin. We use corpora of known national origin as a proxy for national dialect. In order to identify general (as opposed to corpus-specific) characteristics of national dialects of English, we make use of a variety of corpora of different sources, with inter-corpus variation ...

Publication date: 2008